Even more interesting than the intricate organization of complex networks is the dynamical
behavior of the systems that such structures underlie. Among the many types of dynamics, one
particularly interesting category involves the evolution of trails left by moving agents progressing
through random walks and dilating processes in a complex network. The emergence of trails is present
in many dynamical processes, such as pedestrian traffic, information flow and metabolic pathways.
Important problems related to trails include the reconstruction of the trail and the identification
of its source when complete knowledge of the trail is missing. In addition, the following of trails
in multi-agent systems represents a particularly interesting situation related to pedestrian dynamics
and swarm intelligence. The present work addresses these three issues while taking into account
permanent and transient marks left in the visited nodes. Different topologies are considered for
trail reconstruction and trail source identification, including four complex network models and four
real networks, namely the Internet, the US airlines network, an email network and the scientific
collaboration network of complex network researchers. Our results show that the topology of the
network influences trail reconstruction, source identification and agent dynamics.

PACS numbers: 89.75.Hc,89.75.Fb,89.70.+c 



'When you have eliminated the impossible, whatever
remains, however improbable, must be the truth.' (Sir A.
C. Doyle, Sherlock Holmes)



I. INTRODUCTION 

Complex networks have become one of the leading
paradigms in science thanks to their ability to represent
and model highly intricate structures. However, as a
growing number of works have shown, the dynamics of
systems whose connectivity is defined by complex
networks is often even more complex and interesting than
the connectivity of the networks themselves. One
particularly interesting type of nonlinear dynamics
involves the evolution of trails left by moving agents
during random walks or dilation processes along the
network. The term "dilation" refers to the progressive
visiting of neighboring nodes after starting from one or
more nodes. For instance, starting from node i, at each
subsequent time step the neighbors of i are visited, then
their unvisited neighbors, and so on, defining a
hierarchical system of neighborhoods. Although the
dynamics is described here as agents visiting network
sites, it can also be regarded as the evolution of
activity in the nodes of the network, where each network
edge represents the possibility of activity propagation
between the respective nodes. Another important related
problem involves attempts to recover incomplete trails.
In other words, when only partial evidence is available
to observation, it becomes important to try to infer the
full set of visited nodes.

The emergence of trails has been studied as an
interesting type of self-organizing system. Helbing et
al. proposed a model of pedestrian motion in order to
explore the evolution of trails in urban green areas.
Trails have also been considered in swarm intelligence
analysis, not only as a means to understand animal
behavior but also as a source of insights for new
optimization and routing algorithms. These works
considered the evolution of trails in regular grids.
However, the communication structures on which a trail
can be defined are often not homogeneous. Many systems,
such as the Internet [5], social relationships, the
distribution of streets in cities [15] and the
connections between airports, are defined by an
irregular topology; more specifically, most of these
systems are represented by scale-free networks [4]. Here,
we study the influence of different topologies on trail
recovery, source identification and agent dynamics.

The analysis of trails left in complex networks can
have many useful applications. For instance, in
information networks, the recovery of the trail left by a
virus spreading on the Internet can be useful to identify
the source of contamination and to propose strategies for
computer immunization. Similarly, the identification of
the origin of rumors, diseases, fads and opinion
formation is important for understanding the dynamics of
human communication. Another relevant problem is related
to traffic improvement and security. In the former case,
identification of the trails covered by packets exchanged
between computers can help the development of optimal
routing paths. In the latter, the source of terrorism
strategies and drug trafficking can be determined by
analysis of clues identified in social and airline
networks. The analysis of trails can also have useful
applications in biology. For instance, in ecology, trail
analysis can be applied to quantify the interference of
human activity in animal behavior and to identify foci of
pollution. In paleontology, the recovery of the trails of
animal displacement by fossil analysis can help the
understanding of diversification between species. In
epidemiology, the identification of the source of a
disease can help to stop the spreading process as well as
to devise effective prevention strategies.

In order to properly represent trails occurring in
complex networks, we associate state variables with each
node i, i = 1, 2, ..., N, of the network. The trail is
then defined by marking such variables along the
respective dynamical process. Only trails generated by
self-avoiding random walks and dilations are considered
in the current work; both are characterized by the fact
that a node is never visited more than once. We restrict
our attention to binary trails, characterized by binary
state variables [25]. The types of trails can be further
classified by considering the marks to be permanent or
transient. In the latter case, the mark associated with a
node can be deleted after the visit. While many different
transient dynamics are possible, we restrict our
attention to the following two types: (i) Poissonian,
where each mark has a fixed probability of being removed
after the visit; and (ii) Evanescent, where the only
observable portion of the trail corresponds to the
node(s) being currently visited.

The current work addresses the problem of recovering
trails in complex networks and identifying their origin,
while considering permanent and transient binary marks in
four different network models, namely the Erdős-Rényi,
Watts-Strogatz, Barabási-Albert and
Dorogovtsev-Mendes-Samukhin models, and four real
networks: the Internet at the Autonomous System level,
the US airlines network, an email network from the
University Rovira i Virgili and the scientific
collaboration network of complex network researchers. We
also analyze agent propagation in the four network
models. The next sections start by presenting the basic
concepts of complex networks and trails and then report
the simulation results, with respective discussions.



II. BASIC CONCEPTS IN COMPLEX 
NETWORKS AND TRAILS 

An undirected complex network (or graph) G is defined
as G = (V, Q), where V is the set of N nodes and Q is the
set of E edges of the type {i, j}, indicating that nodes
i and j are bidirectionally connected. Such a network can
be completely represented in terms of its adjacency
matrix K, such that the presence of the edge {i, j} is
indicated as K(i, j) = K(j, i) = 1 [otherwise
K(i, j) = K(j, i) = 0]. The degree of a node i
corresponds to the number of edges connected to it, which
can be calculated as $k(i) = \sum_{j=1}^{N} K(i, j)$. The
clustering coefficient is related to the presence of
triangles (cycles of length three) in the network [18].
The clustering coefficient of a node i is given by the
ratio between the number of edges among the neighbors of
i and the maximum possible number of edges among these
neighbors; the clustering coefficient of the network is
the average of the clustering coefficient of its nodes.
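
The two measurements just defined can be illustrated with a minimal Python sketch. It assumes the network is stored as a dictionary mapping each node to the set of its neighbors (an illustrative representation of the adjacency matrix K, not one prescribed by the text), and it assigns a clustering coefficient of zero to nodes with fewer than two neighbors, which is a convention chosen here.

```python
from itertools import combinations

def degree(adj, i):
    """Degree k(i): the number of edges attached to node i."""
    return len(adj[i])

def clustering(adj, i):
    """Edges among the neighbors of i divided by the maximum possible number."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Network clustering coefficient: the average over all nodes."""
    return sum(clustering(adj, i) for i in adj) / len(adj)

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

In this toy graph, node 2 has three neighbors with a single edge among them, so its clustering coefficient is 1/3.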

This article considers four theoretical network models
and four real complex networks. The network models are
(a) Erdős-Rényi (ER), (b) Watts-Strogatz (WS),
(c) Barabási-Albert (BA) and (d)
Dorogovtsev-Mendes-Samukhin (DMS) [20]. In the first
model, networks are constructed by considering a constant
probability λ of connection between any pair of nodes; in
the second, networks start with a regular topology, whose
nodes are connected in a ring to a defined number κ of
neighbors in each direction, and the edges are later
rewired with a fixed probability; networks of the third
and fourth models are grown by starting with m_0 nodes
and progressively adding new nodes with m edges each,
which are connected to the existing nodes with
probability proportional to their degrees. The DMS model
differs from the BA model by adding an initial
attractiveness k_0 to each node, independent of its
degree. When k_0 = 0, the DMS model is similar to the BA
model [20]. All simulations considered in this work
assume that the networks have the same number of nodes,
N = 1000, and average degree
⟨k⟩ = 2m = λ(N − 1) = 2κ = 4. The real networks
considered in this work are the Internet at the level of
autonomous systems [26], the US Airlines network [21],
the e-mail network from the University Rovira i Virgili
(Tarragona) [22] and the scientific collaboration network
of complex network researchers [27].
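
As an illustration of how the parameters follow from the condition on the average degree, the sketch below builds the regular ring used as the starting point of the second model and a random network of the first type, both with N = 1000. The function names and the dictionary-of-sets representation are assumptions made for this example; rewiring and the growth models are not covered.

```python
import random

def ring_lattice(n, kappa):
    """Regular ring: each node is linked to kappa neighbors in each direction."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, kappa + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def erdos_renyi(n, lam, rng):
    """Each of the n(n-1)/2 node pairs is linked with probability lam."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < lam:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def average_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

N = 1000
lam = 4 / (N - 1)            # from <k> = lam (N - 1) = 4
ring = ring_lattice(N, 2)    # kappa = 2, so that <k> = 2 kappa = 4
er = erdos_renyi(N, lam, random.Random(0))
```

The ring has average degree exactly 4, whereas the random network only approaches this value on average, with fluctuations from one realization to another.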

Trails are generated while subsets of the nodes V are
visited during the evolution of random walks or dilations
through the network. We assume that just one trail is
allowed at any time in a complex network. We consider
only self-avoiding random walks, in which no node is
visited more than once. At each node, the agent chooses
the next node to be visited at random among the not yet
visited neighbors of the current node. To understand the
dilation process, let ν(i) be the set of neighbors of
node i. Starting with i_0, the initial node of the
propagation (origin), all nodes in ν(i_0) are visited;
after that, for all j ∈ ν(i_0), the nodes in ν(j) not yet
visited are visited in turn; this process is repeated for
a given number of neighborhood hierarchies (see Fig. 1).
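
The two trail-generating processes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the network is a dictionary of neighbor sets, the function names are hypothetical, and a walk that reaches a node with no unvisited neighbor simply stops early (the text does not say how such dead ends are handled).

```python
import random

def self_avoiding_walk(adj, start, length, rng):
    """Visit up to `length` nodes without revisiting; stops early at a dead end."""
    trail = [start]
    visited = {start}
    while len(trail) < length:
        options = [j for j in adj[trail[-1]] if j not in visited]
        if not options:
            break  # no unvisited neighbor left (assumed stopping rule)
        nxt = rng.choice(sorted(options))
        trail.append(nxt)
        visited.add(nxt)
    return trail

def dilation(adj, origin, hierarchies):
    """Return [level 0, level 1, ...]; level h holds the nodes first reached at step h."""
    levels = [{origin}]
    visited = {origin}
    for _ in range(hierarchies):
        frontier = set()
        for i in levels[-1]:
            frontier |= adj[i] - visited
        visited |= frontier
        levels.append(frontier)
    return levels

# Toy network: 0 is linked to 1 and 2, both linked to 3, which is linked to 4.
adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2, 4}, 4: {3}}
levels = dilation(adj, 0, 2)
walk = self_avoiding_walk(adj, 0, 4, random.Random(1))
```

Keeping the hierarchical levels separate is convenient because the last level alone corresponds to the currently visited nodes of the dilation.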

In order to represent trails, we associate two binary
state variables v(i) and s(i) with each node i, which can
take the values 0 (not yet visited) or 1 (visited). The
state variables v(i) indicate the real visits to each
node but are available only to the moving agents; the
state variables s(i) are the "marks" of the visits still
available for observation, which do not necessarily
provide complete information about the visits. The
structure of the network is assumed to be known to the
observer and possibly also to the moving agent(s). Such a
situation corresponds to many real problems. For
instance, in case the trail is being defined as an
exploring agent moves through unknown territory, the
agent may keep some visited places marked with physical
signs (e.g., flags or stones) which are accessible to
observers, while keeping a complete map of visited sites
available only to her/himself. Trails are here classified
as permanent or transient. In the case of permanent
trails, s(i) = v(i), i.e. all visited nodes are known. In
the transient type, the state variable s(i) of each node
i can be reset to zero after being visited. Transient
trails can be further subdivided into: (i) Poissonian,
characterized by the fact that each visited node has a
fixed probability γ of not being observed, i.e. for nodes
with v(i) = 1, s(i) is 1 with probability 1 − γ and 0
with probability γ (nodes with v(i) = 0 always have
s(i) = 0); and (ii) Evanescent, where only the last
visited nodes are accessible to the observer. Figure 2
shows a classification of the main types of trails
considered in this work.

FIG. 1: Dilating trail with two levels in a simple network.
The origin of this two-hierarchy trail is the black node, whose
immediate neighbors are marked in gray. The nodes with
the crossed pattern correspond to the neighbors of the
neighbors of the source of the trail. The respective evanescent
trail would include only the crossed nodes. A Poissonian version
of this trail would imply a ratio γ of unmarked (and
unobservable) nodes.

[Diagram: trails (random walks or dilations) divide into
permanent and transient; transient trails divide into
Poissonian and evanescent.]

FIG. 2: Trails, including those defined by random walks and
dilations, can be subdivided as being permanent or transient.
The latter type can be further subdivided into Poissonian and
Evanescent.

FIG. 3: The three state variables associated with each
network node i and the defined errors ε, ξ and ρ.

The real extension of a trail is defined as being equal
to the sum of the state variables v(i). The observable
extension of a trail is equal to the sum of the state
variables s(i). Given a trail, we can define the
observation error as being equal to

$$\epsilon = \sum_{i=1}^{N} \left[ 1 - \delta(v(i), s(i)) \right] \qquad (1)$$

where δ(a, b) is the Kronecker delta function, yielding one
in case a = b and zero otherwise. Note that this error
measures the incompleteness of the information provided
to the observer. It is also possible to normalize this error
by dividing it by N, so that 0 ≤ ε ≤ 1; this normalization
is not used in this work.

It is assumed that the observer will try to recover the
original, complete trail from its observation. In this case,
the observer applies some heuristic in order to obtain a
recovered trail specified by an additional set of state
variables r(i) (r(i) = 1 if node i is in the recovered trail).
Such a heuristic may take into account the overlap error
between the observable states s(i) and the recovered
values r(i), defined as

$$\xi = \sum_{i=1}^{N} \left[ 1 - \delta(s(i), r(i)) \right] \qquad (2)$$


Note that, as the observer has no access to v(i), the
recovery error has to be estimated using s(i). The actual
recovery error, which can be used to infer the quality of
the recovery, is given by

$$\rho = \sum_{i=1}^{N} \left[ 1 - \delta(v(i), r(i)) \right] \qquad (3)$$

Figure 3 illustrates the three state variables related to
each network node and the respectively defined errors.
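
The state variables and the three errors can be made concrete with a short Python sketch. The list-based representation and the function names are assumptions made for illustration; each error reduces to counting the nodes at which two of the binary vectors v, s and r disagree.

```python
import random

def poissonian_marks(v, gamma, rng):
    """Each visited node (v[i] == 1) loses its mark with probability gamma."""
    return [1 if vi == 1 and rng.random() >= gamma else 0 for vi in v]

def mismatch(a, b):
    """Sum over nodes of 1 - delta(a[i], b[i]): the number of differing entries."""
    return sum(1 for x, y in zip(a, b) if x != y)

def observation_error(v, s):
    return mismatch(v, s)   # real visits versus observable marks

def overlap_error(s, r):
    return mismatch(s, r)   # observable marks versus recovered trail

def recovery_error(v, r):
    return mismatch(v, r)   # real visits versus recovered trail

v = [1, 1, 1, 1, 0, 0]   # real visits
s = [1, 0, 1, 0, 0, 0]   # observable marks (two marks were removed)
r = [1, 1, 1, 1, 0, 1]   # a hypothetical recovered trail
```

For these vectors the observation error is 2, the overlap error is 3 and the recovery error is 1, which shows that a recovered trail may overlap poorly with the observation and still be close to the real trail.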

When using recovery heuristics based on the evaluation
of the overlap error, it may happen that two or more
different recovered trails yield the same overlap error.
In this case, it is interesting to consider two additional
parameters in order to quantify the quality of the
recovery: (i) the number M of estimated trajectories
corresponding to the minimum overlap error; and (ii) the
fraction f of times that the correct source can be found
among the M recovered trails. When the average values of
M and f are close to 1, the recovery strategy is precise.



III. CONSIDERED PROBLEMS

Although the problem of trail analysis in complex net- 
works is potentially very rich and can be extended to 
many possible interesting situations, for simplicity's sake 
we restrict our interest to the three following cases: 

Poissonian trails from random walks: Because the
consideration of permanent and evanescent trails
left by random walks is trivial [28], we concentrate
our attention on the problem of recovering
Poissonian trails left by single moving agents during
random walks. Once such a trail is recovered,
its source can be estimated as corresponding to one
of its two extremities; we do not consider the problem
of source identification for this kind of trail.
The recovery error is used to measure the quality
of the reconstructed trail.

Poissonian trails from dilations: In this case, only a
fraction of the nodes visited by the dilating process
is available to the observer. Two problems are of
interest here, namely recovering the trail and
identifying its origin. To quantify the quality of the
recovery, we evaluate the average values of the number
of trails with minimal overlap error, ⟨M⟩, and of
the fraction of correct source identifications, ⟨f⟩.

Evanescent trails from dilations: In this type of
problem, only the currently visited nodes are available
to the observer, who is asked to reconstruct
the trail and infer its possible origin. This is
potentially the most challenging of the considered
situations. Note that this case too is subject to
random removal of marks, i.e. the values of s(i) are
not only of the evanescent type but are also subject
to being randomly changed to 0. The results
are evaluated by computing ⟨M⟩ and ⟨f⟩.



IV. STRATEGIES FOR RECOVERY AND 
SOURCE IDENTIFICATION 

Several heuristics can be used for recovering a trail
from the information provided by K and s(i). In this
work, we consider a strategy based on the topological
proximity on the network between nodes with s(i) = 1
that are not connected. In the case of trails left by
random walks, the following algorithm is used:

1. Initialize a list r as being equal to s;

2. For each node i with s(i) = 1:

(a) identify the node j with r(j) = 1 which is
connected to at most one other node marked in r
and is closest to i (in the sense of shortest
topological path, but excluding shortest paths
with length 0 or 1 in the network);




FIG. 4: Example of a simple Poissonian trail in a network. The
black nodes correspond to s; the original trail included the
black and gray nodes.



(b) obtain the list L of nodes linking i to j through
the respective shortest path (if more than one
shortest path exists, one of them is chosen at
random);

(c) for each node k in L, make r(k) = 1. 

After all nodes with s(i) = 1 have been considered, the 
recovered trail will be given by the nodes with r(i) = 1. 

Figure 4 illustrates a simple Poissonian random walk
trail, where the black nodes are those in s. The original 
trail is composed of the nodes in s plus the gray nodes. 
It can be easily verified that the application of the above 
reconstruction heuristic will properly recover the original 
trail in this particular case. More specifically, we would 
have the following sequence of operations: 

Step 1: node 1 connected to node 5 through the shortest 
path (1,2,3,5); 

Step 2: node 2 connected to node 5 (no effect); 

Step 3: node 5 connected to node 2 (no effect); 

Step 4: node 9 connected to node 5 through the shortest 
path (9,8,6,5). 

However, if the dashed edge connecting nodes 9 and
10 were included in the network, a large recovery error
would have been obtained because the algorithm would
link node 9 to node 1 or 2 and not to node 5.
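
The reconstruction heuristic for random walk trails listed above can be sketched in Python as follows. This is one possible literal reading of steps 1 and 2(a)-(c), not necessarily the authors' implementation: the function names are hypothetical, the marks are stored in a dictionary covering every node, and ties among shortest paths are broken by the breadth-first search order rather than at random, so that the example is reproducible.

```python
from collections import deque

def bfs(adj, source):
    """Breadth-first search: returns (distance, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent

def recover_walk(adj, s):
    """s maps every node to its 0/1 observed mark; returns the recovered marks r."""
    r = dict(s)                                    # step 1
    for i in sorted(n for n in s if s[n] == 1):    # step 2
        dist, parent = bfs(adj, i)
        # (a) marked nodes at distance >= 2 with at most one marked neighbor
        candidates = [j for j in dist
                      if r[j] == 1 and dist[j] >= 2
                      and sum(r[w] for w in adj[j]) <= 1]
        if not candidates:
            continue
        j = min(candidates, key=lambda n: (dist[n], n))
        # (b) follow the shortest path back to i; (c) mark every node on it
        while j is not None:
            r[j] = 1
            j = parent[j]
    return r

# Path network 0-1-2-3-4-5; the walk visited every node, three marks survived.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
s = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0, 5: 1}
r = recover_walk(adj, s)
```

On this path network the gaps are filled correctly, since only one route connects the surviving marks; in networks with shortcuts the same procedure can link marks through nodes that were never visited.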

A different strategy is used for recovery and source 
identification in the case of dilation trails, which involves 
repeating the dilation dynamics while starting from each 
of the network nodes. The most likely recovered trails 
are those corresponding to the smallest obtained overlap 
error. Note that more than one trail may correspond to 
the smallest error. Also, observe that the possible trail 
sources are simultaneously determined by this algorithm. 
Actually, it is an interesting fact that complete recovery 
of the trail is automatically guaranteed once the original 
source is properly identified. This is an immediate conse- 
quence of the fact that the recovery strategy involves the 
reproduction of the original dilation, so that the original 
and obtained trails for the correct source will necessarily 
be identical.

Some additional remarks are required in order to
clarify the reason why more than one trail can be
identified as corresponding to the minimal overlap error
in Poissonian dilation trails. Figure 5 illustrates a
simple network with two trails extending through two
hierarchies, one starting from the source A and the other
from B, which are respectively identified by the vertical
and horizontal patterns. Note that some of the nodes are
covered by both trails, being therefore represented by the
crossed pattern. Now, assume that the original trail was
left by A but that the respective Poissonian version only
incorporated the three nodes with thick border (i.e. all
the other nodes along this trail were deleted before
presentation to the observer). Because the three nodes are
shared by both trails, the same overlap error will be
obtained by starting at nodes A or B. It is expected that
the higher the value of γ, the more ambiguous the source
identification becomes.

When many possible recovered trails with the same
overlap error are found, i.e. when M > 1, the
identification of the source is ambiguous. To take this
fact into account, in such cases we consider each of the
possible sources to be as good as the others, so that any
of them can be used as the evaluated source; we therefore
make f = 1/M.
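
The strategy for dilation trails, together with the quantities M and f, can be sketched as below. The sketch makes several assumptions that are not fixed by the text: the candidate trail includes its own source, f is taken as zero when the true source is not among the candidates, and the function names are hypothetical.

```python
def dilation_trail(adj, origin, hierarchies):
    """All nodes within `hierarchies` steps of origin (origin included, by assumption)."""
    visited = {origin}
    frontier = {origin}
    for _ in range(hierarchies):
        frontier = {w for u in frontier for w in adj[u]} - visited
        visited |= frontier
    return visited

def identify_sources(adj, observed, hierarchies):
    """Return (candidate sources with minimal overlap error, that minimal error)."""
    best, best_err = [], None
    for c in sorted(adj):
        trail = dilation_trail(adj, c, hierarchies)
        err = len(trail ^ observed)   # nodes at which s(i) differs from r(i)
        if best_err is None or err < best_err:
            best, best_err = [c], err
        elif err == best_err:
            best.append(c)
    return best, best_err

def source_score(candidates, true_source):
    """f = 1/M when the true source is among the M candidates, else 0 (assumed)."""
    return 1.0 / len(candidates) if true_source in candidates else 0.0

adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}   # path 0-1-2-3-4
full, full_err = identify_sources(adj, {1, 2, 3}, 1)   # permanent trail left from node 2
sparse, sparse_err = identify_sources(adj, {2}, 1)     # only one mark survived
```

With the complete trail the source is identified uniquely (M = 1), whereas with a single surviving mark three candidate sources share the minimal overlap error, giving M = 3 and f = 1/3.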



V. SIMULATION RESULTS AND DISCUSSION 

To evaluate the recovery strategies under different
topologies, randomly generated trails are studied in the
ER, SW, BA and DMS network models and in the networks of
the Internet (AS), US Airlines, e-mail and scientific
collaboration, as indicated previously. The following
sections present and discuss the results.



FIG. 6: The observation error (black squares) and the
recovery errors (white circles) obtained by using the recovery
algorithm for Poissonian trails from random walks in the
(a) ER, (b) SW, (c) BA and (d) DMS network models.



A. Network models 

Each considered network model is formed by N = 1000
nodes and has average degree ⟨k⟩ = 4. All random walk
trails were Poissonian with real extent equal to 20 nodes
and γ = 0.1, 0.2, ..., 0.8. All dilation trails took place
along 2 hierarchies, while the respective Poissonian and
evanescent cases assumed γ = 0.1, 0.2, ..., 0.8. In order
to provide statistically significant results, each
configuration (i.e. type of network, trail and γ) was
simulated 100 times. The rewiring probability in the WS
model is the same as the connection probability in the ER
model, i.e. p = ⟨k⟩/(N − 1). The initial
connectivity in the DMS network model is k_0 = 5.
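
As a rough check of this protocol, note that if each mark of a Poissonian trail is removed independently with probability γ, the expected observation error of a trail with real extent 20 is 20γ. The sketch below estimates this average over 100 realizations for each value of γ; the function name and the fixed random seed are choices made for this example only.

```python
import random

def mean_observation_error(extent, gamma, realizations, rng):
    """Average number of visited nodes whose mark was removed."""
    total = 0
    for _ in range(realizations):
        total += sum(1 for _node in range(extent) if rng.random() < gamma)
    return total / realizations

rng = random.Random(42)
gammas = [round(0.1 * g, 1) for g in range(1, 9)]   # 0.1, 0.2, ..., 0.8
means = {g: mean_observation_error(20, g, 100, rng) for g in gammas}
```

The estimated averages should grow approximately linearly with γ, which gives a baseline against which the recovery error of any heuristic can be compared.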

Figure 6 shows the average observation and recovery
errors, with respective standard deviations, obtained for
the Poissonian random walk trails in the four considered
network models. The figure indicates an almost linear
increase of the recovery error with γ. Such a monotonic
increase is explained by the fact that the higher the value
of γ, the more incomplete the observable states become.
As the recovery of trails with more gaps will necessarily
imply more wrongly recovered patches, the respective
error increases with γ. Also, as can be seen by a
comparison between observation and recovery errors, the
adopted recovery heuristic yielded moderate results for
all considered network models, without significant
differences among the models, which suggests that the
performance of this recovery strategy is largely
independent of the network topology.

Figure 7 gives the average and standard deviation of M
for Poissonian dilation trails corresponding to the
minimal overlap error ξ for ER, SW, BA and DMS networks.
In all of these models, the average and standard
deviation values of M tend to increase with γ, starting
at ⟨M⟩ = 1. This effect is a consequence of the fact that
the sparser the information about the real trail, the more
likely it is that the observable states s can be covered
by dilations starting from different nodes. Interestingly,
the increase of ⟨M⟩ is substantially more accentuated for
ER networks, and BA networks are the least subject to
source determination ambiguities.

FIG. 7: The average and standard deviations, in terms of γ,
of the number M of detected trails corresponding to the
minimal overlap error with respect to Poissonian dilation trails
obtained for ER (a), SW (b), BA (c) and DMS (d) network
models.

FIG. 8: The average (and standard deviations) of the flag f
indicating that the correct source has been identified among
the detected trails with minimal overlap error ξ in the recovery
of Poissonian dilation trails for ER (a), SW (b), BA (c) and
DMS (d) network models.

For the Poissonian dilation trails, the average ⟨f⟩ (and
standard deviation) of the flag f is given in terms of γ
in Figure 8 for ER, SW, BA and DMS networks. It is clear
from these results that the average number of times,
along the realizations, in which the correct source is
identified among those trails corresponding to the
minimal overlap error ξ tends to decrease with γ. This is
a direct consequence of the fact that higher values of γ
imply substantial distortions to the original trail,
ultimately leading to shifts in the identification of the
correct source. The behavior of ⟨f⟩ is similar for the
ER, BA and DMS network models, with a sharp decrease for
γ > 0.3. For SW networks, on the other hand, ⟨f⟩
decreases smoothly. The sources of the trails are best
identified for ER, BA and DMS when γ < 0.3. For higher
values of γ, the sources are best identified for SW
network models.

Finally, we turn our attention to transient dilation
trails of the evanescent category. Recall that in this type
of trail only the current position of the trail (i.e. its
border) is available to the observer. Figure 9 presents the
average and standard deviation of M obtained, in terms
of γ, for the ER, SW, BA and DMS network models. The
result is similar to the case of Poissonian trails (Fig. 7),
with the recovery strategy having the worst results for
ER networks and similar results among the other models.
For the evanescent trails, however, M grows more
gradually than for Poissonian trails.

FIG. 9: The average and standard deviations, in terms of γ,
of the number M of detected evanescent trails corresponding
to the minimal overlap error obtained for ER (a), SW (b),
BA (c) and DMS (d) network models.

Figure 10 shows the average and standard deviation of
the values of the flag f in terms of γ obtained for the same
models. Again, the results are similar to those obtained
for the Poissonian trails (Fig. 8), but with a more gradual
decrease of f for the ER model.

FIG. 10: The average (and standard deviations) of the flag f
indicating that the correct source has been identified among
the detected evanescent trails with minimal overlap error ξ
for ER (a), SW (b), BA (c) and DMS (d) network models.

Remarkably, though retaining less information about
the original trail than their respective Poissonian
counterparts, the evanescent trails tend to allow a similar
identification of the source of the trail and of the
original trail.



TABLE I: Statistical measurements for the considered real
networks. N is the number of nodes, ⟨k⟩ is the average degree
and cc is the average clustering coefficient.

Network                     N       ⟨k⟩     cc
Internet                    3,522   3.59    0.19
USA Airlines network        332     12.81   0.62
Collaboration in science    1,589   3.45    0.02
E-mail network              1,133   19.24   0.19



B. Real networks

We considered four different networks in our
simulations, namely: the Internet at the level of
autonomous systems, the US Airlines network [21], the
e-mail network from the University Rovira i Virgili
(Tarragona) [12] and the scientific collaboration network
of complex network researchers. Table I presents some
information about these networks. All random walk trails
were Poissonian with real extent equal to 20 nodes and all
dilation trails took place along 2 hierarchies, with
γ = 0.1, 0.2, ..., 0.8. Figure 11 shows the average
recovery errors obtained for the Poissonian random walk
trails in the four considered real networks. Again, as
observed for the network models, the recovery error
increases almost linearly with γ, being only slightly
smaller than the observation error. The adopted recovery
method achieves slightly better results for the US
airlines network than for the other networks.

Figure 12 gives the average and standard deviation of
M for trails corresponding to the minimal overlap error ξ
for Poissonian dilation trails in the considered real
networks. The value of ⟨M⟩ tends to increase with γ for
all networks. For the Internet, ⟨M⟩ shows two distinct
behaviors: (i) for γ < 0.4 and γ > 0.6, ⟨M⟩ increases
slowly; (ii) for 0.4 < γ < 0.6, ⟨M⟩ decreases. In the
region γ < 0.5, M has high standard deviations. In the
case of the US Airlines and the scientific collaboration
networks, ⟨M⟩ has a similar behavior, but with larger
values for the latter than for the US Airlines. The
smallest values of ⟨M⟩ are obtained for the e-mail
network. Therefore, trails can be better recovered in
this type of network, which is an important finding
because it has implications for the identification of the
source of spreading of viruses or rumors, among other
cases.

FIG. 11: The observation (black squares) and recovery (white
circles) errors obtained by using the recovery algorithm for
Poissonian trails from random walks, for (a) the Internet,
(b) the USA airlines, (c) the e-mail network from the
University Rovira i Virgili, and (d) the scientific collaboration of
complex networks researchers.

The average ⟨f⟩ of the correct source identification flag
(and standard deviation) is given in terms of γ in
Figure 13 for the considered real networks. The source
identification is worst for the Internet.

For transient dilation trails of the evanescent category,
the results are shown in Figure 14 (for M) and Figure 15
(for f). As for the models, the results are close to those
obtained considering Poissonian dilation trails, despite
the fact that the evanescent category provides less
information for trail recovery.

FIG. 12: The average and standard deviations, in terms of γ,
of the number M of detected trails corresponding to the
minimal overlap error obtained in the case of Poissonian dilation
trails for (a) the Internet, (b) the USA airlines, (c) the e-mail
network from the University Rovira i Virgili, and (d) the
scientific collaboration of complex networks researchers.

FIG. 13: The average (and standard deviations) of the flag f
indicating that the correct source has been identified among
the detected Poissonian trails with minimal overlap error ξ
for (a) the Internet, (b) the USA airlines, (c) the e-mail network
from the University Rovira i Virgili, and (d) the scientific
collaboration of complex networks researchers.

FIG. 14: The average and standard deviations, in terms of
γ, of the number M of detected evanescent trails corresponding
to the minimal overlap error obtained for (a) the Internet,
(b) the USA airlines, (c) the e-mail network from the
University Rovira i Virgili, and (d) the scientific collaboration of
complex networks researchers.

FIG. 15: The average (and standard deviations) of the flag f
indicating that the correct source has been identified among
the detected evanescent trails with minimal overlap error ξ
for (a) the Internet, (b) the USA airlines, (c) the e-mail network
from the University Rovira i Virgili, and (d) the scientific
collaboration of complex networks researchers.



VI. MULTI-AGENTS 

We considered the dynamics of multi-agents on trail
evolution for four complex network models:
ER, SW, BA and DMS. Each network is formed by
N = 1000 nodes with average degree ⟨k⟩ = 4.
The process is defined as follows: (i) the first agent leaves
a gradient trail (the current position has the strongest
mark and the source the weakest) by a self-avoiding
random walk; (ii) the marks along the path are erased with
probability γ (Poissonian trail, as before); (iii) the second
agent tries to reach the target (the last vertex of the trail)
by preferentially following, in each immediate neighborhood,
the strongest of the marks left by the first agent. When the
second agent does not find any mark, it performs a random walk
until another mark is found. This process is performed,
for example, by ants searching for food: the first agent
can represent an ant that leaves a trail of pheromone that
will be followed by the second ant. The objective of our
investigation is to determine the influence of the topology
on target identification efficiency, as well as possible
overall trajectory minimization, by measuring the length
of the path covered by the second agent. All random
walk trails were Poissonian with real extent equal to 20
nodes and γ = 0.1, 0.2, ..., 0.8. Figure 16 presents the
length of the path covered by the second agent as a
function of the erasing rate γ. As can be clearly seen, when
γ < 0.5 the second agent covers the shortest paths for the BA,
SW and DMS network models, followed by ER. This
suggests that the topology of the network is fundamental
for trajectory following. Indeed, the hubs present in the BA
and DMS network models provide shortcuts through the
network. Enhanced efficiency was also found for the SW
network model, but the high clustering coefficient was
identified as being fundamental in this case. While the
length of the path covered by the second agent remains
almost constant for ER as γ increases, it increases for
the remaining models. For γ > 0.5, the length of the path
for ER reaches its smallest value. Therefore, when the
trail is almost complete, the BA, SW and DMS topologies
provide the best performance, but when the trail is
sparse, ER allows the shortest paths. Thus, the topology
was verified to strongly influence agent dynamics.

FIG. 16: The average and standard deviation of the
length of the path covered by the second agent obtained for
ER (a), SW (b), BA (c) and DMS (d) network models. Each
point is an average of 500 realizations.
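
The two-agent process described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the reported experiments: the function names (random_graph, gradient_trail, erase_marks, follow_trail) are hypothetical, the second agent is assumed to start at the source of the trail, and it is assumed to follow a mark only when that mark is stronger than any mark it has already stood on (otherwise it takes a random step), which prevents it from oscillating between two marked nodes.

```python
import random


def random_graph(n, mean_degree, rng):
    """ER-style random graph as an adjacency dict (hypothetical helper)."""
    p = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def gradient_trail(adj, source, extent, rng):
    """First agent: self-avoiding random walk of at most `extent` nodes.

    The walk stops early if every neighbor has already been visited.
    """
    path, seen = [source], {source}
    while len(path) < extent:
        options = sorted(v for v in adj[path[-1]] if v not in seen)
        if not options:
            break
        path.append(rng.choice(options))
        seen.add(path[-1])
    return path


def erase_marks(path, gamma, rng):
    """Poissonian erasure: each mark is removed independently with probability gamma.

    Mark strength grows along the walk: 1 at the source, len(path) at the target.
    """
    return {v: i + 1 for i, v in enumerate(path) if rng.random() >= gamma}


def follow_trail(adj, marks, start, target, rng, max_steps=100_000):
    """Second agent: number of steps taken to reach `target` (capped at max_steps)."""
    node, steps, best = start, 0, marks.get(start, 0)
    while node != target and steps < max_steps:
        # Assumption: only marks stronger than any already visited are followed.
        stronger = [v for v in adj[node] if marks.get(v, 0) > best]
        if stronger:
            node = max(stronger, key=marks.get)
            best = marks[node]
        else:
            node = rng.choice(sorted(adj[node]))  # plain random-walk step
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(1)
    adj = random_graph(200, 4, rng)
    for gamma in (0.1, 0.5, 0.8):
        lengths = []
        for _ in range(50):
            path = gradient_trail(adj, rng.randrange(200), 20, rng)
            marks = erase_marks(path, gamma, rng)
            lengths.append(follow_trail(adj, marks, path[0], path[-1], rng))
        print(gamma, sum(lengths) / len(lengths))
```

The demonstration uses a smaller network (200 nodes) and fewer realizations than the experiments in the text, purely to keep the run short; replacing random_graph with SW, BA or DMS generators would be needed to compare topologies.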



VII. CONCLUDING REMARKS 

A great part of the interest in complex networks has
stemmed from their ability to represent and model in- 
tricate natural and human-made structures ranging from 
the Internet to protein interaction networks. There is a 
growing interest in the study of dynamics in such systems 
(e.g., @, d, HI]). Among the many types of interesting 
dynamics which can take place on complex networks, we 
have the evolution of trails left by moving agents during
random walks and dilations. In particular, given
one such (possibly incomplete) trail, the immediate
problems are the recovery of the full trail and
the identification of its possible source. Such problems
are particularly important because they are directly
related to a large number of practical and theoretical
situations, including fad and rumor spreading, epidemiology,
exploration of new territories, and transmission of messages
in communications, among many other possibilities.

The important problem of analyzing trails left in
networks by moving agents during random walks and
dilations has been formalized and investigated by
using two heuristic algorithms in the present article.
We considered four models of complex networks,
namely the Erdős-Rényi, Barabási-Albert, Watts-Strogatz,
and Dorogovtsev-Mendes-Samukhin models, and four
different real networks: the Internet at the level of
autonomous systems, the US Airlines, the e-mail network
from the University Rovira i Virgili (Tarragona) and the
scientific collaboration of complex networks researchers.
Also, we considered two types of trails: permanent and
transient. Particular attention was given to trails with
transient marks. In the case of random walk trails, we
investigated how incomplete Poissonian trails can be
recovered by using a shortest path approach. The recovery and
identification of the source of dilation trails was approached
by reproducing the dilating process for each of the
network nodes and comparing the obtained trails with the
observable state variables.
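
As a rough illustration of the second strategy, the sketch below reproduces a dilation from every node and keeps the candidates whose dilation best matches the observed marks. Several points are assumptions made only for this example: the number of dilation steps (radius) is taken as known, the overlap error is taken to be the size of the symmetric difference between the candidate dilation and the observed marked nodes (the exact definition used in this work is not reproduced here), and the names dilate and identify_sources are hypothetical.

```python
from collections import deque


def dilate(adj, source, radius):
    """Set of nodes reached by `radius` dilation steps (BFS levels) from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def identify_sources(adj, observed, radius):
    """Return (candidate sources with minimal overlap error, that error).

    Assumed error: size of the symmetric difference between the reproduced
    dilation and the observed marked nodes.
    """
    errors = {s: len(dilate(adj, s, radius) ^ observed) for s in adj}
    best = min(errors.values())
    return sorted(s for s, e in errors.items() if e == best), best


if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5-6; true source 3, two dilation steps, mark on node 3 erased.
    line = {i: {j for j in (i - 1, i + 1) if 0 <= j < 7} for i in range(7)}
    candidates, error = identify_sources(line, {1, 2, 4, 5}, 2)
    print(candidates, error)
```

In the notation of the figures, the number of returned candidates plays the role of M, and the flag f corresponds to whether the true source appears among them.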

It has been shown through simulation that both
strategies are potentially useful for trail reconstruction
and source identification. In addition, a series of
interesting results and trends have been identified. First, it
has been found that the shortest path approach for the
recovery of trails left by random walks provides similar results
for all considered networks and network models, which
suggests that this strategy is independent of the network
topology. Second, for dilation trails it was found that
the Poissonian and evanescent types of trails allow similar
efficiency in the identification of sources, despite the fact
that the latter trails incorporate less information than
the former.
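
A minimal sketch of a shortest-path recovery of this kind is given below. It rests on an assumption that should be stated loudly: the surviving marks are taken to be available in visiting order, and each gap between two consecutive surviving marks is filled with one breadth-first shortest path. The names shortest_path and recover_trail are hypothetical, and the heuristic actually used in this work may differ in its details.

```python
from collections import deque


def shortest_path(adj, a, b):
    """One shortest path from a to b found by BFS, or None if b is unreachable."""
    prev = {a: None}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            break
        for v in sorted(adj[u]):  # sorted only to make tie-breaking reproducible
            if v not in prev:
                prev[v] = u
                queue.append(v)
    if b not in prev:
        return None
    path = []
    while b is not None:
        path.append(b)
        b = prev[b]
    return path[::-1]


def recover_trail(adj, observed):
    """Fill the gaps between consecutive surviving marks with shortest paths.

    `observed` is assumed to list the surviving marked nodes in visiting order.
    """
    if not observed:
        return []
    trail = [observed[0]]
    for a, b in zip(observed, observed[1:]):
        segment = shortest_path(adj, a, b)
        if segment is None:
            raise ValueError(f"no path between {a} and {b}")
        trail.extend(segment[1:])
    return trail


if __name__ == "__main__":
    # Ring of six nodes; the true trail 0-1-2-3 lost the mark on node 1.
    ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    print(recover_trail(ring, [0, 2, 3]))
```

When several shortest paths exist between two surviving marks, the reconstruction is ambiguous and this sketch simply returns the first one found.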

The analysis of multi-agents on networks showed
that the topology strongly influences the respective
performance. When the trail is almost complete,
the Barabási-Albert, Watts-Strogatz and
Dorogovtsev-Mendes-Samukhin network models provide the best
performance. On the other hand, when the information
about the trail is sparse, the final point of the trail is
reached faster for the Erdős-Rényi network model.

It is believed that the suggested methods and exper- 
imental results have paved the way to a number of im- 
portant related works, including the investigation of the 
scaling of the effects and trends identified in the present 
work to other network sizes, average node degrees and 
network models. At the same time, it would be interest- 
ing to consider graded state variables, more than a sin- 
gle trail taking place simultaneously in a network, other 
types of random walks (e.g., preferential [24]), as well as
alternative recovery and source identification strategies.
One particularly promising future possibility regards the
recovery of diffusive dynamics in complex networks.